feat: map maintainers by email if username not found (CM-773) #3598
| platform IN ('github', 'git', 'gitlab') | ||
| AND "verified" = TRUE | ||
| AND value = $1 |
@joanagmaia FYI, I didn't limit the type to email only because for git we don't have usernames, only emails stored with type username. That should be fine, right?
Yeah that's ok. Makes sense. And for git yeah we actually have to use type username since type emails are not verified. So all good
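For reference, a minimal sketch of what the lookup discussed above might look like. Only the WHERE clause comes from the diff; the selected column, the "memberIdentities" table name (taken from the PR summary), the LIMIT, and the injected asyncpg-style fetchval callable are assumptions for illustration, not the PR's actual implementation. Note that the query deliberately has no filter on identity type, since git stores emails under type username.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# WHERE clause copied from the diff. The SELECT column, table name and LIMIT
# are assumptions made for this sketch.
FIND_BY_EMAIL_SQL = """
SELECT id
FROM "memberIdentities"
WHERE platform IN ('github', 'git', 'gitlab')
  AND "verified" = TRUE
  AND value = $1
LIMIT 1
"""

# Hypothetical signature: an asyncpg-style fetchval(query, arg) callable is
# injected so the sketch does not depend on a specific database helper.
FetchVal = Callable[[str, str], Awaitable[Optional[str]]]


async def find_maintainer_identity_by_email(
    email: str, fetchval: FetchVal
) -> Optional[str]:
    """Return the first verified identity whose value equals the email, or None."""
    return await fetchval(FIND_BY_EMAIL_SQL, email)
```

Passing the email as the $1 parameter rather than formatting it into the string keeps the query safe from injection.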
| await find_github_identity(github_username) | ||
| if github_username != "unknown" | ||
| else await find_maintainer_identity_by_email(email) | ||
| ) |
Bug: Missing Null Checks Break Identity Logic
The logic doesn't handle None values for github_username and email. When github_username is None, the condition github_username != "unknown" evaluates to True, causing find_github_identity(None) to be called instead of falling back to email-based lookup. Similarly, when both are None, the early return check fails since None != "unknown", allowing invalid lookups to proceed. The code only checks for the string "unknown" but the model fields are nullable.
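One possible way to address this, sketched under assumptions: treat None, empty strings, and the "unknown" sentinel the same way before choosing a lookup. The helper names is_known and resolve_identity are hypothetical, and the two lookup functions are passed in as parameters so the sketch stays self-contained. As in the diff, a usable username still takes precedence and email is only tried when the username is missing.

```python
import asyncio
from typing import Awaitable, Callable, Optional

UNKNOWN = "unknown"  # sentinel the LLM returns when extraction fails

# Hypothetical type for the injected lookup coroutines.
Lookup = Callable[[str], Awaitable[Optional[str]]]


def is_known(value: Optional[str]) -> bool:
    """True only for a non-empty string that is not the 'unknown' sentinel."""
    return value is not None and value.strip() not in ("", UNKNOWN)


async def resolve_identity(
    github_username: Optional[str],
    email: Optional[str],
    find_github_identity: Lookup,
    find_maintainer_identity_by_email: Lookup,
) -> Optional[str]:
    """Look up by username when usable, else by email, else give up."""
    if is_known(github_username):
        return await find_github_identity(github_username)
    if is_known(email):
        return await find_maintainer_identity_by_email(email)
    # Neither field is usable: the caller should log a warning and skip.
    return None
```

With this shape, a None username falls through to the email lookup instead of reaching find_github_identity(None), and the case where both fields are None returns early.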
| email = maintainer.email | ||
|
|
||
| if github_username == "unknown" and email == "unknown": | ||
| self.logger.warning("username & email with value 'unknown' aborting") | ||
| return | ||
| identity_id = await find_github_identity(github_username) | ||
| identity_id = ( | ||
| await find_github_identity(github_username) | ||
| if github_username != "unknown" | ||
| else await find_maintainer_identity_by_email(email) | ||
| ) |
Can you give me a bit more context on this unknown logic?
The LLM returns "unknown" if it didn't manage to extract the expected values. That was the logic from v1; I didn't change it as it was working fine.
This pull request enhances the maintainer extraction and identification process by introducing support for email-based identity lookup as a fallback when a GitHub username is unavailable. It also updates the maintainer data model and extraction prompt to include email addresses, improving the robustness and accuracy of maintainer identification.
Maintainer identity lookup improvements:
- Added the find_maintainer_identity_by_email function to search for maintainer identities by email in the database, supporting cases where the GitHub username is unknown.
Data model and prompt updates:
- Updated the MaintainerInfoItem model to include an email field, ensuring that email addresses are captured and stored for maintainers.
- Updated the extraction prompt to require an email field for each maintainer object, specifying extraction instructions and fallback behavior if the email is not found.
Note
Adds email to maintainer extraction/model and uses a new verified email lookup to resolve identities when GitHub username is missing.
- Falls back to email lookup when github_username is "unknown"; skips entries where both username and email are "unknown".
- Adds find_maintainer_identity_by_email, querying verified identities across github, git, and gitlab in memberIdentities.
- Extends MaintainerInfoItem with email.
- Requires email in each maintainer object, with extraction rules and an "unknown" fallback.
Written by Cursor Bugbot for commit 2253555.